Learning an Expert from Human Annotations in Statistical Machine Translation: the Case of Out-of-Vocabulary Words
Authors
Abstract
We present a general method for incorporating an “expert” model into a Statistical Machine Translation (SMT) system, in order to improve its performance on a particular “area of expertise”, and apply this method to the specific task of finding adequate replacements for Out-of-Vocabulary (OOV) words. Candidate replacements are paraphrases and entailed phrases, obtained using monolingual resources. These candidate replacements are transformed into “dynamic biphrases”, generated at decoding time based on the context of each source sentence. Standard SMT features are enhanced with a number of new features aimed at scoring translations produced by using different replacements. Active learning is used to discriminatively train the model parameters from human assessments of the quality of translations. The learning framework yields an SMT system which is able to deal with sentences containing OOV words but also guarantees that the performance is not degraded for input sentences without OOV words. Results of experiments on English-French translation show that this method outperforms previous work addressing OOV words in terms of acceptability.
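The idea of turning candidate replacements into "dynamic biphrases" can be illustrated with a minimal sketch. This is not the authors' implementation: the `Biphrase` class, the `dynamic_biphrases` function, the dictionary-based phrase table, and the choice to score a biphrase as the product of the replacement confidence and the phrase-table score are all hypothetical simplifications introduced here for illustration. The abstract does not specify how the features are combined, and the sketch ignores sentence context and multi-word phrases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Biphrase:
    source: str   # phrase as it appears in the input sentence
    target: str   # proposed translation
    score: float  # illustrative score, not a feature from the paper


def dynamic_biphrases(sentence, phrase_table, replacements):
    """Build sentence-specific biphrases for OOV words (illustrative only).

    phrase_table: dict mapping a known source word to a list of
        (target, score) pairs; stands in for the static SMT phrase table.
    replacements: dict mapping an OOV word to a list of
        (candidate, confidence) pairs, e.g. paraphrases or entailed words.
    """
    result = []
    for word in sentence.split():
        if word in phrase_table:
            # Known word: the static table already covers it.
            continue
        for candidate, confidence in replacements.get(word, []):
            for target, score in phrase_table.get(candidate, []):
                # Pair the OOV word directly with the translation of its
                # replacement; multiplying the two scores is an assumption.
                result.append(Biphrase(word, target, confidence * score))
    # Highest-scoring biphrases first.
    return sorted(result, key=lambda b: b.score, reverse=True)
```

Because biphrases are only created for words missing from the static table, a sentence with no OOV words yields an empty list in this sketch and would be translated exactly as before, which mirrors the abstract's claim that such inputs are not degraded.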
Similar resources
On The Effects of Literal Translation, L1 Glosses and Context, Applied in Reading Comprehension, on Iranian EFL Learners Vocabulary Learning: The Case of Different Proficiencies
This paper investigates the effect of L1 translation, L1 gloss and context on the vocabulary learning of Iranian EFL learners. A total of 120 EFL students in private English institutes in Sari participated in the present study. They were divided into two proficiency groups and three learning conditions. In order to make a list of words unknown to the learners the partici...
The Effects of Multimedia Annotations on Iranian EFL Learners’ L2 Vocabulary Learning
In our modern technological world, Computer-Assisted Language Learning (CALL) is a new approach to learning a language in general, and to learning L2 vocabulary in particular. It is assumed that the use of multimedia annotations promotes language learners’ vocabulary acquisition. Therefore, this study set out to investigate the effects of different multimedia annotations (still picture annotatio...
The Comparative Impact of Pictorial Annotations and Morphological Instruction on Lexical Inferencing of Iranian Intermediate EFL Learners
One of the main ways to acquire unfamiliar words is to make guesses about their meaning. This study investigates the comparative effects of pictorial annotations and morphological instruction on Iranian EFL learners’ lexical inferencing ability. Considering homogeneity issues using PET (Preliminary English Test), the researchers assigned the participants into two experimental and one control g...
The Effect of Visually-Mediated Collocations on the Elementary EFL Learners’ Vocabulary Learning
When vocabulary teaching is taken into account in EFL classes in our Iranian state primary schools, teachers generally prefer to use classical techniques. The purpose of this study was to investigate the effect of visually-mediated collocations on the elementary EFL learners’ vocabulary learning. In order to conduct this study, 60 students from two classrooms in an elementary class, participate...
Word clustering effect on vocabulary learning of EFL learners: A case of semantic versus phonological clustering
The aim of this study is to determine the effect of word clustering method on vocabulary learning of Iranian EFL learners through a case of semantic versus phonological clustering. To this effect, 80 homogeneous students from four intermediate classes at an English institute in Torbat e Heydariyeh participated in this research. They were assigned to four groups according to semantic versus phon...
Publication date: 2010